Analysis of a Very Large AltaVista Query Log

نویسندگان

  • Craig Silverstein
  • Monika Henzinger
  • Hannes Marais
  • Michael Moricz
چکیده

In this paper we present an analysis of a 280 GB AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents approximately 285 million user sessions, each an attempt to fill a single information need. We present an analysis of individual queries, query duplication, and query sessions. Furthermore we present results of a correlation analysis of the log entries, studying the interaction of terms within queries. Our data supports the conjecture that web users differ significantly from the user assumed in the standard information retrieval literature. Specifically, we show that web users type in short queries, mostly look at the first 10 results only, and seldom modify the query. This suggests that traditional information retrieval techniques might not work well for answering web search requests. The correlation analysis showed that the most highly correlated items are constituents of phrases. This result indicates it may be useful for search engines to consider search terms as parts of phrases even if the user did not explicitly specify them as such.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Temporal Comparison of AltaVista Web

AltaVista, are essential tools in the quest to locate online information. This article reports research that used transaction log analysis to examine the characteristics and changes in AltaVista Web searching that occurred from 1998 to 2002. The research questions we examined are (1) What are the changes in AltaVista Web searching from 1998 to 2002? (2) What are the current characteristics of A...

متن کامل

Analysis of User query refinement behavior based on semantic features: user log analysis of Ganj database (IranDoc)

Background and Aim: Information systems cannot be well designed or developed without a clear understanding of needs of users, manner of their information seeking and evaluating. This research has been designed to analyze the Ganj (Iranian research institute of science and technology database) users’ query refinement behaviors via log analysis.    Methods: The method of this research is log anal...

متن کامل

Discovering Popular Clicks\' Pattern of Teen Users for Query Recommendation

Search engines are still the most important gates for information search in internet. In this regard, providing the best response in the shortest time possible to the user's request is still desired. Normally, search engines are designed for adults and few policies have been employed considering teen users. Teen users are more biased in clicking the results list than are adult users. This leads...

متن کامل

Web searching in Chinese: A study of a search engine in Hong Kong

been conducted on the query logs in search engines that are primarily English-based (e.g., Excite and AltaVista), only a few of them have studied the information-seeking behavior on the Web in non-English languages. In this article, we report the analysis of the search-query logs of a search engine that focused on Chinese. Three months of search-query logs of Timway, a search engine based in Ho...

متن کامل

EEQR: An Energy Efficient Query-Based Routing Protocol for Wireless Sensor Networks

Routing in Wireless Sensor Networks (WSNs) is a very challenging task due to the large number of nodes, their mobility and lack of proper infrastructure. Since the sensors are battery powered devices, energy efficiency is considered as one of the main factors in designing routing protocols in WSNs. Most of energy-aware routing protocols are mere energy savers that attempt to decrease the energy...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998